Park: An Open Platform for Learning-Augmented Computer Systems
Hongzi Mao, Parimarjan Negi, Akshay Narayan, Hanrui Wang, Jiacheng Yang, Haonan Wang, Ryan Marcus, Ravichandra Addanki, Mehrdad Khani Shirkoohi, Songtao He, Vikram Nathan, Frank Cangialosi, Shaileshh Venkatakrishnan, Wei-Hung Weng, Song Han, Tim Kraska, Mohammad Alizadeh
LocalBench: Benchmarking LLMs on County-Level Local Knowledge and Reasoning
Gao, Zihan, Xu, Yifei, Thebault-Spieker, Jacob
Large language models (LLMs) have been widely evaluated on macro-scale geographic tasks, such as global factual recall, event summarization, and regional reasoning. Yet, their ability to handle hyper-local knowledge remains poorly understood. This gap is increasingly consequential as real-world applications, from civic platforms to community journalism, demand AI systems that can reason about neighborhood-specific dynamics, cultural narratives, and local governance. Existing benchmarks fall short in capturing this complexity, often relying on coarse-grained data or isolated references. We present LocalBench, the first benchmark designed to systematically evaluate LLMs on county-level local knowledge across the United States. Grounded in the Localness Conceptual Framework, LocalBench includes 14,782 validated question-answer pairs across 526 U.S. counties in 49 states, integrating diverse sources such as Census statistics, local subreddit discourse, and regional news. It spans physical, cognitive, and relational dimensions of locality. Using LocalBench, we evaluate 13 state-of-the-art LLMs under both closed-book and web-augmented settings. Our findings reveal critical limitations: even the best-performing models reach only 56.8% accuracy on narrative-style questions and perform below 15.5% on numerical reasoning. Moreover, larger model size and web augmentation do not guarantee better performance: for example, search improves Gemini's accuracy by 13.6% but reduces GPT-series performance by 11.4%. These results underscore the urgent need for language models that can support equitable, place-aware AI systems, capable of engaging with the diverse, fine-grained realities of local communities across geographic and cultural contexts.
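The closed-book evaluation described above reduces to per-category exact-match scoring. A minimal sketch with invented data; the item schema, category names, and answers below are illustrative assumptions, not LocalBench's actual format:

```python
# Hypothetical LocalBench-style scorer: exact-match accuracy reported
# separately for narrative vs. numerical questions, as the abstract describes.

def score(predictions, gold):
    """Case-insensitive exact-match accuracy over aligned answer lists."""
    correct = sum(p.strip().lower() == g.strip().lower()
                  for p, g in zip(predictions, gold))
    return correct / len(gold)

# Toy county-level QA items (invented for illustration).
items = [
    {"type": "narrative", "gold": "annual crawfish festival"},
    {"type": "numerical", "gold": "74000"},
    {"type": "narrative", "gold": "parish council"},
]
preds = ["Annual Crawfish Festival", "69000", "parish council"]

# Group predictions and gold answers by question category.
by_type = {}
for item, pred in zip(items, preds):
    by_type.setdefault(item["type"], [[], []])
    by_type[item["type"]][0].append(pred)
    by_type[item["type"]][1].append(item["gold"])

for qtype, (p, g) in sorted(by_type.items()):
    print(qtype, round(score(p, g), 2))
# prints: narrative 1.0
#         numerical 0.0
```

The per-category split mirrors the abstract's finding that narrative and numerical questions behave very differently under the same model.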
Hybrid Dual-Batch and Cyclic Progressive Learning for Efficient Distributed Training
Lu, Kuan-Wei, Hong, Ding-Yong, Liu, Pangfeng, Wu, Jan-Jan
Distributed machine learning is critical for training deep learning models on large datasets with numerous parameters. Current research primarily focuses on leveraging additional hardware resources and powerful computing units to accelerate the training process. As a result, larger batch sizes are often employed to speed up training. However, training with large batch sizes can lead to lower accuracy due to poor generalization. To address this issue, we propose the dual-batch learning scheme, a distributed training method built on the parameter server framework. This approach maximizes training efficiency by utilizing the largest batch size that the hardware can support while incorporating a smaller batch size to enhance model generalization. By using two different batch sizes simultaneously, this method improves accuracy with minimal additional training time. Additionally, to mitigate the time overhead caused by dual-batch learning, we propose the cyclic progressive learning scheme. This technique repeatedly and gradually increases image resolution from low to high during training, thereby reducing training time. By combining cyclic progressive learning with dual-batch learning, our hybrid approach improves both model generalization and training efficiency. Experimental results with ResNet-18 demonstrate that, compared to conventional training methods, our approach improves accuracy by 3.3% while reducing training time by 10.1% on CIFAR-100, and further achieves a 34.8% reduction in training time on ImageNet.
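The two schemes above lend themselves to a compact sketch. Everything here is illustrative: the gradient-averaging combination rule, the hyperparameters, and the `grad_fn` oracle are assumptions for exposition, not the paper's parameter-server implementation:

```python
def cyclic_resolution_schedule(low, high, step, cycles):
    """Yield image resolutions ramping from low to high, repeated each cycle,
    mirroring the cyclic progressive learning idea in the abstract."""
    ramp = list(range(low, high + 1, step))
    for _ in range(cycles):
        for r in ramp:
            yield r

def dual_batch_step(params, grad_fn, data, large_bs, small_bs, lr=0.1):
    """One illustrative dual-batch update: combine gradients from a large batch
    (hardware-efficient) and a small batch (better generalization).
    `grad_fn(params, batch)` is a user-supplied gradient oracle; the two
    batches overlap here purely for simplicity."""
    g_large = grad_fn(params, data[:large_bs])
    g_small = grad_fn(params, data[:small_bs])
    # Assumed combination rule: plain average of the two gradients.
    return [p - lr * 0.5 * (gl + gs)
            for p, gl, gs in zip(params, g_large, g_small)]
```

For example, `list(cyclic_resolution_schedule(64, 224, 32, cycles=3))` produces three low-to-high resolution ramps in a row, the repetition being what distinguishes the cyclic variant from one-shot progressive resizing.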
Predictive Performance of Deep Quantum Data Re-uploading Models
Wang, Xin, Tao, Han-Xiao, Wu, Re-Bing
Quantum machine learning models incorporating data re-uploading circuits have garnered significant attention due to their exceptional expressivity and trainability. However, their ability to generate accurate predictions on unseen data, referred to as the predictive performance, remains insufficiently investigated. This study reveals a fundamental limitation in predictive performance when deep encoding layers are employed within the data re-uploading model. Concretely, we theoretically demonstrate that when processing high-dimensional data with limited-qubit data re-uploading models, their predictive performance progressively degenerates to near random-guessing levels as the number of encoding layers increases. In this context, the repeated data uploading cannot mitigate the performance degradation. These findings are validated through experiments on both synthetic linearly separable datasets and real-world datasets. Our results demonstrate that when processing high-dimensional data, the quantum data re-uploading models should be designed with wider circuit architectures rather than deeper and narrower ones.
Quantum Natural Language Processing: A Comprehensive Review of Models, Methods, and Applications
Nausheen, Farha, Ahmed, Khandakar, Khan, M Imad, Riaz, Farina
In recent developments, deep learning methodologies applied to Natural Language Processing (NLP) have revealed a paradox: they improve performance but demand considerable data and resources for their training. Alternatively, quantum computing exploits the principles of quantum mechanics to overcome the computational limitations of current methodologies, thereby establishing an emerging field known as quantum natural language processing (QNLP). This domain holds the potential to attain a quantum advantage in the processing of linguistic structures, surpassing classical models in both efficiency and accuracy. In this paper, we propose a categorisation of QNLP models based on quantum computing principles, architecture, and computational approaches. This paper provides a survey of how quantum meets language by mapping the state of the art in this area, embracing quantum encoding techniques for classical data, QNLP models for prevalent NLP tasks, and quantum optimisation techniques for hyperparameter tuning. The landscape of quantum computing approaches applied to various NLP tasks is summarised by showcasing the specific QNLP methods used, and the popularity of these methods is indicated by their count. From the findings, it is observed that QNLP approaches are still limited to small data sets, with only a few models explored extensively, and there is increasing interest in the application of quantum computing to natural language processing tasks.
Convex Maneuver Planning for Spacecraft Collision Avoidance
Vega, Fausto, Arrizabalaga, Jon, Watson, Ryan, Manchester, Zachary
Conjunction analysis and maneuver planning for spacecraft collision avoidance remains a manual and time-consuming process, typically involving repeated forward simulations of hand-designed maneuvers. With the growing density of satellites in low-Earth orbit (LEO), autonomy is becoming essential for efficiently evaluating and mitigating collisions. In this work, we present an algorithm to design low-thrust collision-avoidance maneuvers for short-term conjunction events. We first formulate the problem as a nonconvex quadratically-constrained quadratic program (QCQP), which we then relax into a convex semidefinite program (SDP) using Shor's relaxation. We demonstrate empirically that the relaxation is tight, which enables the recovery of globally optimal solutions to the original nonconvex problem. Our formulation produces a minimum-energy solution while ensuring a desired probability of collision at the time of closest approach. Finally, if the desired probability of collision cannot be satisfied, we relax this constraint into a penalty, yielding a minimum-risk solution. We validate our algorithm with a high-fidelity simulation of a satellite conjunction in low-Earth orbit with a simulated conjunction data message (CDM), demonstrating its effectiveness in reducing collision risk.
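The QCQP-to-SDP step described above follows the standard Shor recipe. As a sketch in generic notation (the symbols below are placeholders, not the paper's exact formulation):

```latex
% Nonconvex QCQP over the stacked maneuver/state vector x:
%   minimize   x^T Q_0 x                      (energy objective)
%   subject to x^T Q_i x + q_i^T x \le b_i,   i = 1,\dots,m
% Shor's relaxation lifts X = x x^T and drops the rank-1 constraint:
\begin{align}
\min_{X,\,x} \quad & \langle Q_0, X \rangle \\
\text{s.t.} \quad & \langle Q_i, X \rangle + q_i^T x \le b_i, \quad i = 1,\dots,m, \\
& \begin{bmatrix} 1 & x^T \\ x & X \end{bmatrix} \succeq 0 .
\end{align}
% When the relaxation is tight (rank(X) = 1 at the optimum, as the authors
% observe empirically), x recovers the global optimum of the original QCQP.
```

The tightness observation is what lets a convex solver certify global optimality for the original nonconvex maneuver-planning problem.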
Robust LLM Training Infrastructure at ByteDance
Wan, Borui, Liu, Gaohong, Song, Zuquan, Wang, Jun, Zhang, Yun, Sheng, Guangming, Wang, Shuguang, Wei, Houmin, Wang, Chenyuan, Lou, Weiqiang, Yang, Xi, Zhang, Mofan, Jiang, Kaihua, Ren, Cheng, Zhi, Xiaoyun, Yu, Menghan, Nan, Zhe, Zheng, Zhuolin, Zhong, Baoquan, Wang, Qinlong, Yu, Huan, Chi, Jinxin, Zhang, Wang, Li, Yuhan, Du, Zixian, Zhao, Sida, Zhang, Yongqiang, Tang, Jingzhe, Liu, Zherui, Wu, Chuan, Peng, Yanghua, Lin, Haibin, Xiao, Wencong, Liu, Xin, Xiang, Liang
The training scale of large language models (LLMs) has reached tens of thousands of GPUs and is still continuously expanding, enabling faster learning of larger models. Accompanying the expansion of the resource scale is the prevalence of failures (CUDA errors, NaN values, job hangs, etc.), which poses significant challenges to training stability. Any large-scale LLM training infrastructure should strive for minimal training interruption, efficient fault diagnosis, and effective failure tolerance to enable highly efficient continuous training. This paper presents ByteRobust, a large-scale GPU infrastructure management system tailored for robust and stable training of LLMs. It exploits the uniqueness of the LLM training process and gives top priority to detecting and recovering from failures in a routine manner. Leveraging the parallelisms and characteristics of LLM training, ByteRobust enables high-capacity fault tolerance, prompt fault demarcation, and localization with an effective data-driven approach, comprehensively ensuring continuous and efficient training of LLM tasks. ByteRobust is deployed on a production GPU platform and achieves 97% ETTR for a three-month training job on 9,600 GPUs.
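ETTR (effective training time ratio) is commonly defined as productive training time divided by total wall-clock time. A minimal sketch under that assumed definition, with invented failure numbers; the abstract does not spell out how ByteRobust accounts downtime:

```python
def ettr(total_hours, downtime_events):
    """Effective training time ratio: fraction of wall-clock time spent on
    productive training. `downtime_events` lists hours lost to each failure
    (detection + diagnosis + restart-from-checkpoint). Definition assumed
    here, not taken from the paper."""
    lost = sum(downtime_events)
    if not 0 <= lost <= total_hours:
        raise ValueError("downtime cannot exceed total time")
    return (total_hours - lost) / total_hours

# Toy numbers: a 90-day (2160 h) job with several failure incidents.
print(round(ettr(2160, [12, 8, 20, 6, 18.8]), 3))  # prints: 0.97
```

At this scale the metric is dominated by how quickly failures are demarcated and the job restarted, which is exactly the routine detect-and-recover path the system prioritizes.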